Which bridge estimator is optimal for variable selection?

Authors

Shuaiwen Wang

Haolei Weng

Arian Maleki

Abstract

We study the problem of variable selection for linear models under the high-dimensional asymptotic setting, where the number of observations n grows at the same rate as the number of predictors p. We consider two-stage variable selection techniques (TVS) in which the first stage uses bridge estimators to obtain an estimate of the regression coefficients, and the second stage simply thresholds this estimate to select the “important” predictors. The asymptotic false discovery proportion (AFDP) and asymptotic true positive proportion (ATPP) of these TVS are evaluated. We prove that, for a fixed ATPP, in order to obtain the smallest AFDP one should pick an estimator that minimizes the asymptotic mean square error in the first stage of TVS. This simple observation enables us to evaluate and compare the performance of different TVS with each other and with some standard variable selection techniques, such as LASSO and Sure Independence Screening. For instance, we prove that a TVS with LASSO in its first stage can outperform LASSO (only one stage) over a large range of ATPP. Furthermore, we show that for large values of noise, a TVS with ridge in its first stage outperforms TVS with other bridge estimators, including the one that has LASSO in its first stage.
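The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes ridge (the bridge estimator with q = 2) in the first stage, an objective of the form ||y - Xb||^2 + lam * ||b||^2, and finite-sample FDP and TPP in place of their asymptotic counterparts. The function names, the simulated design, and the values of `lam` and `threshold` are all hypothetical choices made for illustration.

```python
import numpy as np


def ridge_estimate(X, y, lam):
    """First stage: ridge estimate in closed form.

    Assumes the objective ||y - X b||^2 + lam * ||b||^2 (a convention
    chosen for this sketch, not taken from the paper).
    """
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def two_stage_select(X, y, lam, threshold):
    """Second stage: keep predictors whose estimate exceeds the threshold."""
    beta_hat = ridge_estimate(X, y, lam)
    return np.abs(beta_hat) > threshold


def fdp_tpp(selected, beta_true):
    """Finite-sample false discovery and true positive proportions."""
    support = beta_true != 0
    n_selected = int(selected.sum())
    n_support = int(support.sum())
    false_disc = int(np.sum(selected & ~support))
    true_pos = int(np.sum(selected & support))
    # Guard against division by zero when nothing is selected or the
    # true support is empty.
    fdp = false_disc / max(n_selected, 1)
    tpp = true_pos / max(n_support, 1)
    return fdp, tpp


# Hypothetical simulation: n and p of the same order, sparse signal.
rng = np.random.default_rng(0)
n, p, k = 400, 200, 20
X = rng.normal(size=(n, p)) / np.sqrt(n)
beta_true = np.zeros(p)
beta_true[:k] = 5.0
y = X @ beta_true + 0.5 * rng.normal(size=n)

selected = two_stage_select(X, y, lam=0.1, threshold=2.5)
fdp, tpp = fdp_tpp(selected, beta_true)
print(f"FDP = {fdp:.3f}, TPP = {tpp:.3f}")
```

Sweeping `threshold` while holding `lam` fixed traces out an FDP versus TPP curve, which is the finite-sample analogue of the AFDP and ATPP trade-off that the abstract compares across first-stage estimators.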


Similar resources

Feature Selection in High-Dimensional Classification

High-dimensional discriminant analysis is of fundamental importance in multivariate statistics. Existing theoretical results sharply characterize different procedures, providing sharp convergence results for the classification risk, as well as the l2 convergence results to the discriminative rule. However, sharp theoretical results for the problem of variable selection have not been established...


Variable selection in the accelerated failure time model via the bridge method.

In high throughput genomic studies, an important goal is to identify a small number of genomic markers that are associated with development and progression of diseases. A representative example is microarray prognostic studies, where the goal is to identify genes whose expressions are associated with disease free or overall survival. Because of the high dimensionality of gene expression data, s...


OPTIMAL SELECTION OF NUMBER OF RAINFALL GAUGING STATIONS BY KRIGING AND GENETIC ALGORITHM METHODS

In this study, optimal combinations of available rainfall gauging stations are selected by a model that consists of a geostatistical model as an estimator and an optimization model. First, the watershed is approximated by several regular geometric shapes. Then kriging calculates the variance ...


Variable selection for optimal treatment decision.

In decision-making on optimal treatment strategies, it is of great importance to identify variables that are involved in the decision rule, i.e. those interacting with the treatment. Effective variable selection helps to improve the prediction accuracy and enhance the interpretability of the decision rule. We propose a new penalized regression framework which can simultaneously estimate the opt...


Discrete-time repetitive optimal control: Robotic manipulators

This paper proposes a discrete-time repetitive optimal control of electrically driven robotic manipulators using an uncertainty estimator. The proposed control method can be used for performing repetitive motion, which covers many industrial applications of robotic manipulators. This kind of control law is in the class of torque-based control in which the joint torques are generated by permanen...



Journal: CoRR

Volume: abs/1705.08617

Publication date: 2017
